svm: allow conflicting transactions in entries #3453

2501babe merged 6 commits into anza-xyz:master
Conversation
```rust
if let Some(program) = (use_program_cache && is_invisible_read)
    .then_some(())
    .and_then(|_| self.program_cache.find(account_key))
{
    // Optimization to skip loading of accounts which are only used as
    // programs in top-level instructions and not passed as instruction accounts.
    return Some(LoadedTransactionAccount {
        loaded_size: program.account_size,
        account: account_shared_data_from_program(account_key, &self.program_accounts)
            .ok()?,
        rent_collected: 0,
    });
}
```
i made this a separate block because once `disable_account_loader_special_case` activates it is simply deleted. we will also be able to remove `program_cache` and `program_accounts` from `AccountLoader` and they become free-floating variables in `transaction_processor.rs` again
also an idea i have been toying with after we do that is to simply implement `TransactionProcessingCallback` for `AccountLoader` itself so we can pass it to program cache setup, which will happen after account loading. this depends on how feasible it is to restrict callbacks to a concrete type (unless we just make two traits)
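A minimal sketch of that idea, using hypothetical stand-in names (`AccountReader` for the real callback trait, simplified `Pubkey` and `AccountSharedData` types): implementing the callback trait directly on the loader lets it be passed wherever a callback is expected.

```rust
use std::collections::HashMap;

// Illustrative stand-ins; the real types live in the SVM crate.
type Pubkey = [u8; 32];

#[derive(Clone, Debug, PartialEq)]
struct AccountSharedData {
    lamports: u64,
}

// A hypothetical callback trait that program cache setup could accept.
trait AccountReader {
    fn get_account(&self, key: &Pubkey) -> Option<AccountSharedData>;
}

// The loader already holds the post-load account states.
struct AccountLoader {
    cache: HashMap<Pubkey, AccountSharedData>,
}

// Implementing the trait on the loader means it can itself serve as the
// callback for steps that happen after account loading.
impl AccountReader for AccountLoader {
    fn get_account(&self, key: &Pubkey) -> Option<AccountSharedData> {
        self.cache.get(key).cloned()
    }
}
```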
```rust
// If lamports is 0, a previous transaction deallocated this account.
// Return None without inspecting, so it can be recreated.
if cache_item.account.lamports() == 0 {
    None
} else {
    Some(cache_item.account.clone())
}
```
while i was getting this pr ready, i realized we dont need to replace zero lamport accounts in the account cache with `AccountSharedData::default()` at all, so that saves us some `Arc::new()` calls. we just need something in there that stops us from going to accounts-db, and we can replace it when reading. `account_cache` is private and `load_account()` is the only function that reads from it, so we dont need to handle them special anywhere else
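That read path can be sketched minimally as follows; the names here (`load_account`, `AccountCacheItem`, the field layout) are illustrative stand-ins for the real SVM types, and the accounts-db fallback is elided:

```rust
use std::collections::HashMap;

type Pubkey = u64; // stand-in for the real Pubkey type

#[derive(Clone, Debug, PartialEq)]
struct AccountSharedData {
    lamports: u64,
}

impl AccountSharedData {
    fn lamports(&self) -> u64 {
        self.lamports
    }
}

struct AccountCacheItem {
    account: AccountSharedData,
}

struct AccountLoader {
    // Private cache of accounts touched earlier in the batch. A zero-lamport
    // entry marks a deallocated account: it stops us from falling back to
    // accounts-db without allocating a fresh default account.
    account_cache: HashMap<Pubkey, AccountCacheItem>,
}

impl AccountLoader {
    // The only reader of account_cache, so the zero-lamport sentinel never
    // needs special handling anywhere else.
    fn load_account(&self, key: &Pubkey) -> Option<AccountSharedData> {
        if let Some(item) = self.account_cache.get(key) {
            if item.account.lamports() == 0 {
                // Deallocated by a previous transaction: report "not found"
                // so the caller can recreate it as a default account.
                return None;
            }
            return Some(item.account.clone());
        }
        // Cache miss: the real implementation falls back to accounts-db here.
        None
    }
}
```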
```rust
self.account_cache.insert(
    *account_key,
    AccountCacheItem {
        account: account.clone(),
        inspected_as_writable: is_writable,
    },
);
```
i left the cache insert unguarded by `disable_account_loader_special_case`, which means it will be used by 2.2 the moment this pr lands. this is safe (@jstarry should confirm):

- there is no change to pre-simd83 consensus if `disable_account_loader_special_case` is not active, because the account cache comes after the program cache, so behavior is identical to if these accounts came from accounts-db
- `disable_account_loader_special_case` is in 2.0, so it is almost guaranteed to activate before the simd83 consensus gate which will relax account locks. but if, for some startling reason, simd83 consensus activates first, it is still fine. there is an edge case where the program cache would shadow the account cache if you close and reopen a loaderv3 buffer. but this only applies if the account is loaded invisible, so it only affects transaction data size, and this is fine because simd83 consensus necessarily can only be activated by its own feature gate
```rust
if let Some(slot_history) =
    account_overrides.and_then(|overrides| overrides.get(&slot_history::id()))
{
    // We set `inspected_as_writable` to `true` because this never needs to be inspected.
    account_cache.insert(
        slot_history::id(),
        AccountCacheItem {
            account: slot_history.clone(),
            inspected_as_writable: true,
        },
    );
}
```
after this pr lands, we can contemplate a more restricted interface for passing only this account through. maybe `slot_history: Option<AccountSharedData>` on the transaction processing environment
bit of an aside, but the `sol_get_sysvar` syscall we added in 2.0 for core bpf program conversion does not support `SlotHistory` by design, we only needed `StakeHistory` and `SlotHashes`. so the good news is it doesnt introduce a mechanism to bypass this override
the possibly better news is... if neither me nor joe needed to add `SlotHistory`, do any important bpf programs or builtins actually use it? i have no idea. maybe we dont need to let programs see this sysvar at all if its only used by the validator itself. if so, we could add some `#[deprecated]` where needed and special-case it like the instructions sysvar in `load_transaction_account` to return an account with no data (would require a feature gate, perhaps we could avoid a simd)
```rust
account_data.set_rent_epoch(u64::MAX);
account_data.set_lamports(1);
```
the test account is read-only, but this is now necessary because svm would put it in the account cache the first time it loads it in an entry and then assume it had been deallocated the second time it loads it. the svm behavior is correct, because there should be no way to bring an account to zero lamports without it being writable, therefore we should never be able to see one in a read-only context in the first place. our builtins and sysvars have dust lamports
```rust
slot_history::id(),
AccountCacheItem {
    account: slot_history.clone(),
    inspected_as_writable: true,
```
I thought there was a discussion on a previous PR about this `inspected_as_writable` stuff.
The function that is called to inspect accounts already has guards against doing it twice.
Is this still necessary?
we never got confirmation on that. ill check with brooks
alright i have clarity on this now, the thing i was missing is that the idea svm has of "what things to inspect" and the idea bank has of "what results of inspection do i actually care about" are disjoint concepts. we can get rid of the state tracking and inspect every time we get an account from accounts-db or account cache. bank has its own dedupe logic and other implementers of inspect can do whatever they want
removed the state tracking we can get rid of
```rust
// Determine a capacity for the internal account cache. This
// over-allocates but avoids ever reallocating, and spares us from
// deduplicating the account keys lists.
let account_keys_in_batch = sanitized_txs.iter().map(|tx| tx.account_keys().len()).sum();
```
To avoid allocating every time, we could use a thread-local variable instead, wdyt.
example:

```rust
thread_local! {
    static HAS_DUPLICATES_SET: RefCell<AHashSet<Pubkey>> = RefCell::new(AHashSet::with_capacity(MAX_TX_ACCOUNT_LOCKS));
}
```
this code is just getting a reasonable upper bound on account cache size. the mechanism in your pr to prevent duplicate transactions in svm isnt required because svm has no concept of duplicate transactions. my understanding is theyre presently implicitly deduped at the account locks level by relying on the fee-payer lock, and we just need to do this explicitly with message hashes instead
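The upper-bound computation being defended here can be sketched minimally as follows, with a hypothetical `SanitizedTx` stand-in for the real transaction type (the real code calls `tx.account_keys().len()` through an accessor):

```rust
// Illustrative minimal stand-in for a sanitized transaction: just its
// account keys list.
struct SanitizedTx {
    account_keys: Vec<u64>,
}

// Over-allocates whenever keys repeat across transactions, but never needs
// to reallocate and never needs to deduplicate the key lists up front.
fn cache_capacity_upper_bound(batch: &[SanitizedTx]) -> usize {
    batch.iter().map(|tx| tx.account_keys.len()).sum()
}
```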
I am not suggesting here to de-duplicate transactions within a batch. The purpose of this example is to point out a `thread_local` variable use case to avoid the allocation overhead for the account cache in every batch.
Regarding the de-duplication of transactions: once conflicting batches are allowed, the account locking function won't implicitly de-duplicate such transactions. Then, as you said, we need to explicitly check that a batch does not contain any duplicate transactions using message hashes.
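That explicit check could look something like this sketch (hypothetical `SanitizedTx` with a bare `message_hash` field; the real type exposes an accessor):

```rust
use std::collections::HashSet;

// Illustrative minimal transaction: only the message hash matters here.
struct SanitizedTx {
    message_hash: [u8; 32],
}

// Post-SIMD-83 the fee-payer write lock no longer rejects duplicates, so a
// batch must be screened for repeated message hashes explicitly.
fn batch_has_duplicates(batch: &[SanitizedTx]) -> bool {
    let mut seen = HashSet::with_capacity(batch.len());
    // insert() returns false on a repeat, so all() fails on the first dupe.
    !batch.iter().all(|tx| seen.insert(tx.message_hash))
}
```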
oh, ty for the suggestion but we dont want to reuse the cache between batches
Just for clarification, according to the previous conversation the second PR containing the account cache was envisioned to be feature-gated.
thats correct, the decision to feature gate svm was made because the necessary logic to first go to the account cache and then possibly fall back to the program cache was extremely brittle and had an enormous number of edge cases. but for unrelated reasons we decided to remove (via feature gate) the program cache from account loading entirely and backported it to 2.0, so we were able to simplify the logic here to be much more sound
Sounds good, so if I understand correctly, the …
```rust
if let Some(slot_history) =
    account_overrides.and_then(|overrides| overrides.get(&slot_history::id()))
{
    account_cache.insert(slot_history::id(), slot_history.clone());
}
```
Should we add a debug assert that the size of `account_overrides` is equal to 1 if the sysvar is present and 0 otherwise? Or is it basically impossible now that we decreased the visibility of the account overrides `set_account` function?
it isnt possible, the only exposed setter is `set_slot_history` now (which accepts `AccountSharedData` and hardcodes the `Pubkey` as `sysvar::slot_history::id()`). we can remove `AccountOverrides` entirely now as a cleanup. im thinking ill look into it when i start on simd186, since id like to change something about simulation results, and this is related to simulation only
```rust
self.callbacks
    .inspect_account(account_key, AccountState::Alive(account), is_writable);
```
Shouldn't we check if lamports is 0 before saying the account is alive? Otherwise you could read a dead account into the cache and later write to it, and even though it's still dead, we say it's alive.
youre right, ill open a pr for this on monday. thankfully it doesnt affect bank because to get here we would have already had to inspect as writable (which is why this was missed; before removing `AccountCacheItem`, inspecting a previously-alive now-dead account never happened at all)
edit: actually there is a slight edge case of taking a dead account read-only, leaving it dead, and making it alive, but this is only possible post-simd83 and the fix is the same
edit2: nevermind there actually is no way to get a bad first inspect because to bring the account to zero lamports in the first place it must necessarily be inspected as writable
Hmm isn't there a bug post simd-83 where you read a dead account in one tx and then write to it in another? The inspect for the write tx would say it's alive when it's actually dead
yes, i made my first edit thinking of that, but made the second edit thinking of the reason why we didnt need to check for equality with `AccountSharedData::default()`. but in this case the account doesnt need to exist
(i have a pr for this now but have been blocked by ci build issues, will add you once its green)
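A sketch of the fix being discussed, assuming simplified stand-ins for `AccountState` and `AccountSharedData`: classify the account by its lamports at inspection time, so a cached-but-deallocated account is never reported alive.

```rust
#[derive(Clone, Debug, PartialEq)]
struct AccountSharedData {
    lamports: u64,
}

// Simplified stand-in for the state passed to the inspect callback.
#[derive(Debug, PartialEq)]
enum AccountState {
    Alive(AccountSharedData),
    Dead,
}

// Before handing the account to inspect_account, check lamports so an
// account deallocated earlier in the batch is reported Dead, not Alive.
fn classify(account: &AccountSharedData) -> AccountState {
    if account.lamports == 0 {
        AccountState::Dead
    } else {
        AccountState::Alive(account.clone())
    }
}
```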
Problem
simd83 intends to relax entry-level constraints, namely that a transaction which takes a write lock on an account cannot be batched with any other transaction which takes a read or write lock on it
before the locking code can be changed, however, svm must be able to support such batches. presently, if one transaction were to write to an account, and a subsequent transaction tried to read from or write to that account, svm would not pass the updated account state to the second transaction; it gets account states from accounts-db, which is not updated until after all transaction execution finishes
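The failure mode can be pictured with a toy per-batch cache (all names here are illustrative, not the PR's actual types): later transactions read committed results from the cache first and only fall back to the frozen accounts-db state on a miss.

```rust
use std::collections::HashMap;

type Pubkey = u64;

#[derive(Clone, Debug, PartialEq)]
struct Account {
    lamports: u64,
}

// Toy per-batch cache: execution results are committed here so later
// transactions in the same entry observe them, instead of re-reading the
// stale pre-batch state from accounts-db.
struct BatchCache {
    accounts_db: HashMap<Pubkey, Account>, // frozen until the batch ends
    cache: HashMap<Pubkey, Account>,       // updated after each transaction
}

impl BatchCache {
    fn load(&self, key: &Pubkey) -> Option<Account> {
        self.cache
            .get(key)
            .or_else(|| self.accounts_db.get(key))
            .cloned()
    }

    fn commit(&mut self, key: Pubkey, account: Account) {
        self.cache.insert(key, account);
    }
}
```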
Summary of Changes
in #3404, we refactored transaction processing to perform all steps for each transaction in sequence, and added the `AccountLoader` type to enable the future addition of an intermediate account cache for changed states. this pr adds that cache

all code changes are in the first commit, and the commit is delightfully small
im still minorly displeased at how much code we have to copy from account saver `collect_accounts_to_store`, because those functions also need the pre-execution transactions, the post-execution transactions, and accumulate multiple vectors of data. maybe we could make the logic in `update_accounts_for_executed_tx` et al runtime utility functions and have account saver do two passes through everything, but i think this kind of refactor is out of scope here

all other commits are changes to tests. the vast majority of new lines of code in this pr are new integration tests, but they are basically unchanged from #3146. the only important change to the integration tests is making the framework emulate how zero lamport accounts are deallocated. we dont reuse the code on `AccountLoader`, to guard against refactors introducing bugs in its own logic